Pushing Stochastic Gradient towards Second-Order Methods -- Backpropagation Learning with Transformations in Nonlinearities

Abstract

Recently, we proposed to transform the outputs of each hidden neuron in a multi-layer perceptron network to have zero output and zero slope on average, and use separate shortcut connections to model the linear dependencies instead. We continue the work by firstly introducing a third transformation to normalize the scale of the outputs of each hidden neuron, and secondly by analyzing the connections to second-order optimization methods. We show that the transformations make a simple stochastic gradient behave closer to second-order optimization methods and thus speed up learning. This is shown both in theory and with experiments. The experiments on the third transformation show that while it further increases the speed of learning, it can also hurt performance by converging to a worse local optimum, where both the inputs and outputs of many hidden neurons are close to zero.
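
As a rough illustration of the transformations described in the abstract, the sketch below (our own assumption, not code from the paper) shows how per-unit parameters alpha, beta, and gamma could be chosen from a batch of pre-activations of a single tanh hidden unit so that its average slope is zero, its average output is zero, and its output scale is normalized. In the paper, the linear component removed this way is carried by separate shortcut connections rather than discarded.

```python
import numpy as np

def transformed_tanh(x, eps=1e-8):
    """Apply the three output transformations to one tanh hidden unit.

    x : pre-activations of the unit over a mini-batch, shape (batch,).
    Returns the transformed activations and the (alpha, beta, gamma) parameters.
    Names and the exact estimation rule are illustrative assumptions.
    """
    t = np.tanh(x)
    # 1) Zero average slope: d/dx [tanh(x) + alpha*x] = 1 - tanh(x)**2 + alpha,
    #    so choose alpha to cancel the mean slope over the batch.
    alpha = -np.mean(1.0 - t ** 2)
    # 2) Zero average output: shift by beta so the mean activation vanishes.
    beta = -np.mean(t + alpha * x)
    centered = t + alpha * x + beta
    # 3) Third transformation: rescale so the output has unit standard deviation.
    gamma = 1.0 / (np.std(centered) + eps)
    return gamma * centered, (alpha, beta, gamma)

# Example: transform a batch of pre-activations for one hidden unit.
h, params = transformed_tanh(np.random.randn(128))
```

In this sketch the parameters are recomputed from a single batch for clarity; a practical implementation would track them as running estimates alongside the weights during training.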
